In [ ]:
import keras
keras.__version__
Out[ ]:
'2.13.1'

5.2 - Using convnets with small datasets¶

This notebook contains the code sample found in Chapter 5, Section 2 of Deep Learning with Python. Note that the original text features far more content, in particular further explanations and figures: in this notebook, you will only find source code and related comments.

Training a convnet from scratch on a small dataset¶

Having to train an image classification model using only very little data is a common situation, which you will likely encounter yourself in practice if you ever do computer vision in a professional context.

Having "few" samples can mean anywhere from a few hundred to a few tens of thousands of images. As a practical example, we will focus on classifying images as "dogs" or "cats", in a dataset containing 4000 pictures of cats and dogs (2000 cats, 2000 dogs). We will use 2000 pictures for training, 1000 for validation, and finally 1000 for testing.

In this section, we will review one basic strategy to tackle this problem: training a new model from scratch on what little data we have. We will start by naively training a small convnet on our 2000 training samples, without any regularization, to set a baseline for what can be achieved. This will get us to a classification accuracy of 71%. At that point, our main issue will be overfitting. Then we will introduce data augmentation, a powerful technique for mitigating overfitting in computer vision. By leveraging data augmentation, we will improve our network to reach an accuracy of 82%.

In the next section, we will review two more essential techniques for applying deep learning to small datasets: doing feature extraction with a pre-trained network (this will get us to an accuracy of 90% to 93%), and fine-tuning a pre-trained network (this will get us to our final accuracy of 95%). Together, these three strategies -- training a small model from scratch, doing feature extraction using a pre-trained model, and fine-tuning a pre-trained model -- will constitute your future toolbox for tackling the problem of doing computer vision with small datasets.

The relevance of deep learning for small-data problems¶

You will sometimes hear that deep learning only works when lots of data is available. This is in part a valid point: one fundamental characteristic of deep learning is that it is able to find interesting features in the training data on its own, without any need for manual feature engineering, and this can only be achieved when lots of training examples are available. This is especially true for problems where the input samples are very high-dimensional, like images.

However, what constitutes "lots" of samples is relative -- relative to the size and depth of the network you are trying to train, for starters. It isn't possible to train a convnet to solve a complex problem with just a few tens of samples, but a few hundred can potentially suffice if the model is small and well-regularized and if the task is simple. Because convnets learn local, translation-invariant features, they are very data-efficient on perceptual problems. Training a convnet from scratch on a very small image dataset will still yield reasonable results despite a relative lack of data, without the need for any custom feature engineering. You will see this in action in this section.

What's more, deep learning models are by nature highly repurposable: you can take, say, an image classification or speech-to-text model trained on a large-scale dataset and then reuse it on a significantly different problem with only minor changes. Specifically, in the case of computer vision, many pre-trained models (usually trained on the ImageNet dataset) are now publicly available for download and can be used to bootstrap powerful vision models out of very little data. That's what we will do in the next section.

For now, let's get started by getting our hands on the data.

Downloading the data¶

The cats vs. dogs dataset that we will use isn't packaged with Keras. It was made available by Kaggle.com as part of a computer vision competition in late 2013, back when convnets weren't quite mainstream. You can download the original dataset at: https://www.kaggle.com/c/dogs-vs-cats/data (you will need to create a Kaggle account if you don't already have one -- don't worry, the process is painless).

The pictures are medium-resolution color JPEGs. They look like this:

[Figure: cats_vs_dogs_samples]

Unsurprisingly, the cats vs. dogs Kaggle competition in 2013 was won by entrants who used convnets. The best entries could achieve up to 95% accuracy. In our own example, we will get fairly close to this accuracy (in the next section), even though we will be training our models on less than 10% of the data that was available to the competitors. This original dataset contains 25,000 images of dogs and cats (12,500 from each class) and is 543MB large (compressed). After downloading and uncompressing it, we will create a new dataset containing three subsets: a training set with 1000 samples of each class, a validation set with 500 samples of each class, and finally a test set with 500 samples of each class.

Here are a few lines of code to do this:

In [ ]:
import os, shutil
In [ ]:
import cv2
from google.colab import drive

drive.mount("/content/gdrive")

#img = cv2.imread('/content/gdrive/MyDrive/ColabNotebooks/sizedgrumpycat.jpg', cv2.IMREAD_GRAYSCALE)
Mounted at /content/gdrive
In [ ]:
# The path to the directory where the original
# dataset was uncompressed
#original_dataset_dir = '/Users/fchollet/Downloads/kaggle_original_data'
original_dataset_dir = '/content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/train'

# The directory where we will
# store our smaller dataset
#base_dir = '/Users/fchollet/Downloads/cats_and_dogs_small'
base_dir = '/content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small'

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
train_cats_dir = os.path.join(train_dir, 'cats')
train_dogs_dir = os.path.join(train_dir, 'dogs')
validation_cats_dir = os.path.join(validation_dir, 'cats')
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
test_cats_dir = os.path.join(test_dir, 'cats')
test_dogs_dir = os.path.join(test_dir, 'dogs')

print(train_dir, "\n", train_cats_dir, "\n", train_dogs_dir)
print(validation_dir, "\n", validation_cats_dir, "\n", validation_dogs_dir)
print(test_dir, "\n", test_cats_dir, "\n", test_dogs_dir)
/content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/train 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/train/cats 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/train/dogs
/content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/validation 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/validation/cats 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/validation/dogs
/content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/test 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/test/cats 
 /content/gdrive/MyDrive/ColabNotebooks/dogs-vs-cats/cats_and_dogs_small/test/dogs
In [ ]:
# Run this cell only once: it builds the directory tree for the
# smaller dataset and copies the selected images into it.

# Directories for our training, validation and test splits,
# each with one subdirectory per class
for directory in (train_cats_dir, train_dogs_dir,
                  validation_cats_dir, validation_dogs_dir,
                  test_cats_dir, test_dogs_dir):
    os.makedirs(directory, exist_ok=True)

def copy_images(label, start, stop, dst_dir):
    # Copy '<label>.<i>.jpg' for i in range(start, stop) into dst_dir
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(label, i)
        src = os.path.join(original_dataset_dir, fname)
        dst = os.path.join(dst_dir, fname)
        shutil.copyfile(src, dst)

# Copy first 1000 cat images to train_cats_dir
copy_images('cat', 0, 1000, train_cats_dir)
# Copy next 500 cat images to validation_cats_dir
copy_images('cat', 1000, 1500, validation_cats_dir)
# Copy next 500 cat images to test_cats_dir
copy_images('cat', 1500, 2000, test_cats_dir)

# Copy first 1000 dog images to train_dogs_dir
copy_images('dog', 0, 1000, train_dogs_dir)
# Copy next 500 dog images to validation_dogs_dir
copy_images('dog', 1000, 1500, validation_dogs_dir)
# Copy next 500 dog images to test_dogs_dir
copy_images('dog', 1500, 2000, test_dogs_dir)

As a sanity check, let's count how many pictures we have in each split (train/validation/test):

In [ ]:
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))
total training cat images: 1000
total training dog images: 1000
total validation cat images: 500
total validation dog images: 500
total test cat images: 500
total test dog images: 500

So we have indeed 2000 training images, and then 1000 validation images and 1000 test images. In each split, there is the same number of samples from each class: this is a balanced binary classification problem, which means that classification accuracy will be an appropriate measure of success.

Building our network¶

We've already built a small convnet for MNIST in the previous example, so you should be familiar with them. We will reuse the same general structure: our convnet will be a stack of alternated Conv2D (with relu activation) and MaxPooling2D layers.

However, since we are dealing with bigger images and a more complex problem, we will make our network accordingly larger: it will have one more Conv2D + MaxPooling2D stage. This serves both to augment the capacity of the network, and to further reduce the size of the feature maps, so that they aren't overly large when we reach the Flatten layer. Here, since we start from inputs of size 150x150 (a somewhat arbitrary choice), we end up with feature maps of size 7x7 right before the Flatten layer.

Note that the depth of the feature maps is progressively increasing in the network (from 32 to 128), while the size of the feature maps is decreasing (from 148x148 to 7x7). This is a pattern that you will see in almost all convnets.
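
The shrinking spatial sizes follow from simple arithmetic. The sketch below assumes what the layers above use by default: each 3x3 convolution applies no padding with stride 1 (so it trims 2 pixels per dimension), and each 2x2 max-pooling halves the size, rounding down. The helper names are illustrative and not part of Keras.

```python
def conv_output_size(size, kernel=3):
    # Unpadded ("valid") convolution with stride 1
    return size - kernel + 1

def pool_output_size(size, pool=2):
    # Non-overlapping max-pooling, rounding down
    return size // pool

def feature_map_sizes(input_size=150, stages=4):
    # Track the spatial size after every Conv2D and MaxPooling2D layer
    sizes = []
    size = input_size
    for _ in range(stages):
        size = conv_output_size(size)
        sizes.append(size)
        size = pool_output_size(size)
        sizes.append(size)
    return sizes

print(feature_map_sizes())  # [148, 74, 72, 36, 34, 17, 15, 7]
```

With four Conv2D + MaxPooling2D stages, a 150x150 input ends at 7x7, which is the size that reaches the Flatten layer.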

Since we are attacking a binary classification problem, we are ending the network with a single unit (a Dense layer of size 1) and a sigmoid activation. This unit will encode the probability that the network is looking at one class or the other.

In [ ]:
from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

Let's take a look at how the dimensions of the feature maps change with every successive layer:

In [ ]:
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 74, 74, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 36, 36, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 17, 17, 128)       0         
 g2D)                                                            
                                                                 
 conv2d_3 (Conv2D)           (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 7, 7, 128)         0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 6272)              0         
                                                                 
 dense (Dense)               (None, 512)               3211776   
                                                                 
 dense_1 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3453121 (13.17 MB)
Trainable params: 3453121 (13.17 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
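
The parameter counts in this summary can be checked by hand. The sketch below assumes the standard counting rule: a convolution layer has one kernel of height x width x input channels plus one bias per filter, and a dense layer has one weight per input plus one bias per unit. The helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel plus one bias term per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input plus one bias term per unit
    return (inputs + 1) * units

layer_params = [
    conv2d_params(3, 3, 3, 32),      # conv2d
    conv2d_params(3, 3, 32, 64),     # conv2d_1
    conv2d_params(3, 3, 64, 128),    # conv2d_2
    conv2d_params(3, 3, 128, 128),   # conv2d_3
    dense_params(7 * 7 * 128, 512),  # dense (input is the flattened 7x7x128 map)
    dense_params(512, 1),            # dense_1
]

print(layer_params)       # [896, 18496, 73856, 147584, 3211776, 513]
print(sum(layer_params))  # 3453121
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the 6272 flattened values to each of its 512 units.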

For our compilation step, we'll go with the RMSprop optimizer as usual. Since we ended our network with a single sigmoid unit, we will use binary crossentropy as our loss (as a reminder, check out the table in Chapter 4, section 5 for a cheatsheet on what loss function to use in various situations).

In [ ]:
from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])
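
To make the loss concrete, here is a minimal sketch of binary crossentropy for a single sample, written in plain Python rather than with Keras. The function name and the clipping constant `eps` are illustrative choices; the clipping is only there to avoid taking the logarithm of zero.

```python
import math

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    # Clip the predicted probability so that log() never receives 0
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

print(round(binary_crossentropy(1, 0.5), 4))  # 0.6931 (undecided prediction)
print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054 (confident and right)
print(round(binary_crossentropy(1, 0.1), 4))  # 2.3026 (confident and wrong)
```

A model that outputs 0.5 for every image scores about 0.69, so a loss near that value means the network is still at chance level. Confident wrong predictions are penalized heavily, which is how a validation loss can keep growing even while validation accuracy stays flat.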

Data preprocessing¶

As you already know by now, data should be formatted into appropriately pre-processed floating point tensors before being fed into our network. Currently, our data sits on a drive as JPEG files, so the steps for getting it into our network are roughly:

  • Read the picture files.
  • Decode the JPEG content to RGB grids of pixels.
  • Convert these into floating point tensors.
  • Rescale the pixel values (between 0 and 255) to the [0, 1] interval (as you know, neural networks prefer to deal with small input values).
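
The last two steps can be sketched by hand with NumPy. This example skips reading and decoding a real file: it assumes a randomly generated array as a stand-in for a decoded 150x150 JPEG, so only the conversion and rescaling are shown.

```python
import numpy as np

# Stand-in for a decoded JPEG: a 150x150 RGB grid of 8-bit pixel values
rng = np.random.default_rng(0)
decoded = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)

# Convert to floating point, then rescale from [0, 255] to [0, 1]
tensor = decoded.astype("float32") / 255.0

print(tensor.shape, tensor.dtype)
print(tensor.min() >= 0.0, tensor.max() <= 1.0)
```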

It may seem a bit daunting, but thankfully Keras has utilities to take care of these steps automatically. Keras has a module with image processing helper tools, located at keras.preprocessing.image. In particular, it contains the class ImageDataGenerator, which lets you quickly set up Python generators that automatically turn image files on disk into batches of pre-processed tensors. This is what we will use here.

In [ ]:
from keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.

Let's take a look at the output of one of these generators: it yields batches of 150x150 RGB images (shape (20, 150, 150, 3)) and binary labels (shape (20,)). 20 is the number of samples in each batch (the batch size). Note that the generator yields these batches indefinitely: it just loops endlessly over the images present in the target folder. For this reason, we need to break the iteration loop at some point.

In [ ]:
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break
data batch shape: (20, 150, 150, 3)
labels batch shape: (20,)
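
The endless looping is easy to picture with a plain Python generator. This is only a sketch of the behavior, not how Keras implements it: the function name is made up, the "samples" are integers instead of images, and unlike the Keras generator it does not shuffle.

```python
import itertools

def endless_batches(samples, batch_size):
    # Yield consecutive batches forever, starting over after the last one
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

samples = list(range(50))

# islice plays the role of the `break` above: it stops after four batches
first_four = list(itertools.islice(endless_batches(samples, 20), 4))
print([len(batch) for batch in first_four])  # [20, 20, 10, 20]
```

The fourth batch starts again from the first sample, which is why something outside the generator has to decide when an epoch ends.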

Let's fit our model to the data using the generator. We do it by passing the generator to the fit method, which supports generators directly (the older fit_generator method is deprecated in recent Keras versions). It expects as first argument a Python generator that will yield batches of inputs and targets indefinitely, like ours does. Because the data is being generated endlessly, the fitting process needs to know how many batches to draw from the generator before declaring an epoch over. This is the role of the steps_per_epoch argument: after having drawn steps_per_epoch batches from the generator, i.e. after having run for steps_per_epoch gradient descent steps, the fitting process will go to the next epoch. In our case, each batch contains 20 samples, so it will take 100 batches until we see our target of 2000 samples.

When fitting from a generator, one may pass a validation_data argument, just as when fitting from arrays. Importantly, this argument is allowed to be a data generator itself, but it could be a tuple of Numpy arrays as well. If you pass a generator as validation_data, then this generator is expected to yield batches of validation data endlessly, and thus you should also specify the validation_steps argument, which tells the process how many batches to draw from the validation generator for evaluation.

In [ ]:
history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50)
Epoch 1/30
100/100 [==============================] - 969s 10s/step - loss: 0.6968 - acc: 0.4925 - val_loss: 0.6987 - val_acc: 0.5000
Epoch 2/30
100/100 [==============================] - 7s 69ms/step - loss: 0.6917 - acc: 0.5265 - val_loss: 0.6709 - val_acc: 0.5700
Epoch 3/30
100/100 [==============================] - 7s 68ms/step - loss: 0.6728 - acc: 0.6025 - val_loss: 0.6463 - val_acc: 0.6460
Epoch 4/30
100/100 [==============================] - 7s 70ms/step - loss: 0.6371 - acc: 0.6480 - val_loss: 0.6121 - val_acc: 0.6640
Epoch 5/30
100/100 [==============================] - 7s 66ms/step - loss: 0.6004 - acc: 0.6705 - val_loss: 0.6022 - val_acc: 0.6550
Epoch 6/30
100/100 [==============================] - 7s 69ms/step - loss: 0.5443 - acc: 0.7235 - val_loss: 0.5619 - val_acc: 0.7170
Epoch 7/30
100/100 [==============================] - 7s 67ms/step - loss: 0.4731 - acc: 0.7740 - val_loss: 0.6264 - val_acc: 0.7080
Epoch 8/30
100/100 [==============================] - 7s 66ms/step - loss: 0.4312 - acc: 0.8010 - val_loss: 0.6026 - val_acc: 0.7360
Epoch 9/30
100/100 [==============================] - 7s 68ms/step - loss: 0.3689 - acc: 0.8335 - val_loss: 0.5476 - val_acc: 0.7400
Epoch 10/30
100/100 [==============================] - 7s 67ms/step - loss: 0.2990 - acc: 0.8760 - val_loss: 0.6482 - val_acc: 0.7160
Epoch 11/30
100/100 [==============================] - 7s 68ms/step - loss: 0.2324 - acc: 0.8990 - val_loss: 0.7020 - val_acc: 0.7150
Epoch 12/30
100/100 [==============================] - 7s 67ms/step - loss: 0.1769 - acc: 0.9330 - val_loss: 0.8441 - val_acc: 0.7350
Epoch 13/30
100/100 [==============================] - 7s 67ms/step - loss: 0.1230 - acc: 0.9575 - val_loss: 1.1012 - val_acc: 0.7400
Epoch 14/30
100/100 [==============================] - 7s 68ms/step - loss: 0.0796 - acc: 0.9690 - val_loss: 1.5038 - val_acc: 0.7230
Epoch 15/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0701 - acc: 0.9735 - val_loss: 1.3085 - val_acc: 0.7420
Epoch 16/30
100/100 [==============================] - 7s 68ms/step - loss: 0.0662 - acc: 0.9820 - val_loss: 1.5484 - val_acc: 0.7430
Epoch 17/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0437 - acc: 0.9885 - val_loss: 1.5869 - val_acc: 0.7470
Epoch 18/30
100/100 [==============================] - 7s 66ms/step - loss: 0.0724 - acc: 0.9795 - val_loss: 1.6959 - val_acc: 0.7120
Epoch 19/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0450 - acc: 0.9870 - val_loss: 1.6519 - val_acc: 0.7630
Epoch 20/30
100/100 [==============================] - 6s 64ms/step - loss: 0.0310 - acc: 0.9875 - val_loss: 2.6006 - val_acc: 0.7270
Epoch 21/30
100/100 [==============================] - 7s 69ms/step - loss: 0.0849 - acc: 0.9845 - val_loss: 1.4715 - val_acc: 0.7580
Epoch 22/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0226 - acc: 0.9910 - val_loss: 1.8964 - val_acc: 0.7670
Epoch 23/30
100/100 [==============================] - 7s 69ms/step - loss: 0.0431 - acc: 0.9865 - val_loss: 2.2871 - val_acc: 0.7210
Epoch 24/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0216 - acc: 0.9955 - val_loss: 3.5743 - val_acc: 0.6820
Epoch 25/30
100/100 [==============================] - 7s 68ms/step - loss: 0.0428 - acc: 0.9880 - val_loss: 2.6008 - val_acc: 0.7470
Epoch 26/30
100/100 [==============================] - 7s 68ms/step - loss: 0.0352 - acc: 0.9900 - val_loss: 2.8550 - val_acc: 0.7070
Epoch 27/30
100/100 [==============================] - 7s 67ms/step - loss: 0.0369 - acc: 0.9900 - val_loss: 2.6528 - val_acc: 0.7170
Epoch 28/30
100/100 [==============================] - 7s 69ms/step - loss: 0.0178 - acc: 0.9930 - val_loss: 4.0152 - val_acc: 0.7050
Epoch 29/30
100/100 [==============================] - 7s 66ms/step - loss: 0.0238 - acc: 0.9940 - val_loss: 2.8662 - val_acc: 0.7420
Epoch 30/30
100/100 [==============================] - 7s 70ms/step - loss: 8.2552e-05 - acc: 1.0000 - val_loss: 3.0803 - val_acc: 0.7350
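
The step counts passed to the training call above come from dividing each split's size by the batch size. A small sketch of that arithmetic follows; the helper name is illustrative, and rounding up is an assumption that only matters when the split size is not a multiple of the batch size.

```python
import math

def steps_for(num_samples, batch_size):
    # Number of batches needed to see every sample once
    return math.ceil(num_samples / batch_size)

print(steps_for(2000, 20))  # 100 training steps per epoch
print(steps_for(1000, 20))  # 50 validation steps
```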

It is good practice to always save your models after training:

In [ ]:
model.save('cats_and_dogs_small_Assign6_1.h5')
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3000: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
  saving_api.save_model(

Let's plot the loss and accuracy of the model over the training and validation data during training:

In [ ]:
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

These plots are characteristic of overfitting. Our training accuracy rises steadily until it reaches nearly 100%, while our validation accuracy stalls at around 70-76%. Our validation loss reaches its minimum (about 0.55) at epoch 9 and then climbs, ending at about 3, while the training loss keeps decreasing until it reaches nearly 0.
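
The turning point can also be read directly from the numbers instead of the plot. The sketch below uses the first twelve val_loss values copied from the training log above; in the notebook itself the same list is available as history.history['val_loss']. The helper name is illustrative.

```python
# Validation loss for epochs 1 to 12, copied from the training log
val_loss = [0.6987, 0.6709, 0.6463, 0.6121, 0.6022, 0.5619,
            0.6264, 0.6026, 0.5476, 0.6482, 0.7020, 0.8441]

def best_epoch(losses):
    # Return the 1-based epoch with the lowest loss, and that loss
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return best_index + 1, losses[best_index]

print(best_epoch(val_loss))  # (9, 0.5476)
```

Training beyond that epoch only improves the fit to the training set, which is what the widening gap between the two curves shows.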

Because we only have relatively few training samples (2000), overfitting is going to be our number one concern. You already know about a number of techniques that can help mitigate overfitting, such as dropout and weight decay (L2 regularization). We are now going to introduce a new one, specific to computer vision, and used almost universally when processing images with deep learning models: data augmentation.

Problem #1¶

Adding an L1 regularizer (0.0001) to the 512-unit Dense layer¶

In [ ]:
from keras import layers
from keras import models
from keras import regularizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu',
                       kernel_regularizer=regularizers.l1(0.0001)))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_16 (Conv2D)          (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d_16 (MaxPooli  (None, 74, 74, 32)        0         
 ng2D)                                                           
                                                                 
 conv2d_17 (Conv2D)          (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_17 (MaxPooli  (None, 36, 36, 64)        0         
 ng2D)                                                           
                                                                 
 conv2d_18 (Conv2D)          (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_18 (MaxPooli  (None, 17, 17, 128)       0         
 ng2D)                                                           
                                                                 
 conv2d_19 (Conv2D)          (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_19 (MaxPooli  (None, 7, 7, 128)         0         
 ng2D)                                                           
                                                                 
 flatten_4 (Flatten)         (None, 6272)              0         
                                                                 
 dense_8 (Dense)             (None, 512)               3211776   
                                                                 
 dense_9 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3453121 (13.17 MB)
Trainable params: 3453121 (13.17 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
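
The parameter counts are the same as for the unregularized model because a kernel regularizer adds no weights. It adds a penalty term to the loss instead. Here is a minimal sketch of the L1 penalty, using a toy weight list and an illustrative data-loss value (both are assumptions made for the example, not values from this model).

```python
def l1_penalty(weights, factor=0.0001):
    # L1 regularization: factor times the sum of absolute weight values
    return factor * sum(abs(w) for w in weights)

toy_weights = [0.5, -0.25, 0.0, 1.0]
data_loss = 0.6931  # illustrative crossentropy value

# The quantity being minimized is the data loss plus the penalty
total_loss = data_loss + l1_penalty(toy_weights)

print(l1_penalty(toy_weights))
print(total_loss)
```

Keras includes this penalty in the loss it reports, so the loss values of a regularized model are not directly comparable with those of an unregularized one. With more than three million weights in this Dense layer, the penalty may explain a first-epoch loss well above 0.69.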

Running the model for 15 epochs¶

In [ ]:
from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])

history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=15,
      validation_data=validation_generator,
      validation_steps=50)
Epoch 1/15
100/100 [==============================] - 8s 68ms/step - loss: 2.3935 - acc: 0.5005 - val_loss: 0.7605 - val_acc: 0.5000
Epoch 2/15
100/100 [==============================] - 7s 70ms/step - loss: 0.7444 - acc: 0.4735 - val_loss: 0.7415 - val_acc: 0.5000
Epoch 3/15
100/100 [==============================] - 7s 68ms/step - loss: 0.7493 - acc: 0.5075 - val_loss: 0.7522 - val_acc: 0.5000
Epoch 4/15
100/100 [==============================] - 7s 69ms/step - loss: 0.7430 - acc: 0.5285 - val_loss: 0.7408 - val_acc: 0.5000
Epoch 5/15
100/100 [==============================] - 8s 78ms/step - loss: 0.7429 - acc: 0.5265 - val_loss: 0.7420 - val_acc: 0.5010
Epoch 6/15
100/100 [==============================] - 7s 69ms/step - loss: 0.7363 - acc: 0.5415 - val_loss: 0.7332 - val_acc: 0.5690
Epoch 7/15
100/100 [==============================] - 7s 68ms/step - loss: 0.7230 - acc: 0.5920 - val_loss: 0.6950 - val_acc: 0.6270
Epoch 8/15
100/100 [==============================] - 7s 68ms/step - loss: 0.7110 - acc: 0.6160 - val_loss: 0.6754 - val_acc: 0.6600
Epoch 9/15
100/100 [==============================] - 7s 66ms/step - loss: 0.6977 - acc: 0.6390 - val_loss: 0.6824 - val_acc: 0.6540
Epoch 10/15
100/100 [==============================] - 7s 67ms/step - loss: 0.6808 - acc: 0.6505 - val_loss: 0.6757 - val_acc: 0.6600
Epoch 11/15
100/100 [==============================] - 7s 70ms/step - loss: 0.6618 - acc: 0.6725 - val_loss: 0.6345 - val_acc: 0.7020
Epoch 12/15
100/100 [==============================] - 7s 66ms/step - loss: 0.6402 - acc: 0.6895 - val_loss: 0.6191 - val_acc: 0.7080
Epoch 13/15
100/100 [==============================] - 7s 74ms/step - loss: 0.6308 - acc: 0.7110 - val_loss: 0.6373 - val_acc: 0.6900
Epoch 14/15
100/100 [==============================] - 7s 68ms/step - loss: 0.6052 - acc: 0.7155 - val_loss: 0.6174 - val_acc: 0.7140
Epoch 15/15
100/100 [==============================] - 7s 71ms/step - loss: 0.5855 - acc: 0.7400 - val_loss: 0.6149 - val_acc: 0.7250
In [ ]:
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

Even with L1 regularization there is not much improvement: validation accuracy reaches only 72.5% after 15 epochs.¶

In [ ]:
model.save('cats_and_dogs_small_Assign6_2_L1Reg.h5')
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3000: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
  saving_api.save_model(

Problem #2¶

In [ ]:
'''

from keras import layers
from keras import models
from keras import regularizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu', kernel_regularizer=regularizers.l1 (0.0001) ))
model.add(layers.Dense(1, activation='sigmoid'))

model.summary()
'''
In [ ]:
def data_augmentation(inputs):
  # Placeholder: returns the inputs unchanged, so no augmentation is applied here.
  return inputs
In [ ]:
from keras import layers
from keras import models
from keras import regularizers
import tensorflow as tf
import tensorflow_datasets as tfds

inputs = keras.Input(shape=(150, 150, 3))
x = data_augmentation(inputs)
x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=5, use_bias=False)(x)
for size in [32, 64, 128, 256, 512]:
  residual = x
  x = layers.BatchNormalization()(x)
  x = layers.Activation("relu")(x)
  x = layers.SeparableConv2D(size, 3, padding="same", use_bias=False)(x)
  x = layers.BatchNormalization()(x)
  x = layers.Activation("relu")(x)
  x = layers.SeparableConv2D(size, 3, padding="same", use_bias=False)(x)
  x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
  # 1x1 strided convolution so the residual matches the downsampled shape
  residual = layers.Conv2D(size, 1, strides=2, padding="same", use_bias=False)(residual)

# NOTE: this add sits outside the loop, so only the last block's residual
# is merged into the network (see conv2d_25 and add in the summary below).
x = layers.add([x, residual])
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

model.summary()
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_2 (InputLayer)        [(None, 150, 150, 3)]        0         []                            
                                                                                                  
 rescaling (Rescaling)       (None, 150, 150, 3)          0         ['input_2[0][0]']             
                                                                                                  
 conv2d_20 (Conv2D)          (None, 146, 146, 32)         2400      ['rescaling[0][0]']           
                                                                                                  
 batch_normalization (Batch  (None, 146, 146, 32)         128       ['conv2d_20[0][0]']           
 Normalization)                                                                                   
                                                                                                  
 activation (Activation)     (None, 146, 146, 32)         0         ['batch_normalization[0][0]'] 
                                                                                                  
 separable_conv2d (Separabl  (None, 146, 146, 32)         1312      ['activation[0][0]']          
 eConv2D)                                                                                         
                                                                                                  
 batch_normalization_1 (Bat  (None, 146, 146, 32)         128       ['separable_conv2d[0][0]']    
 chNormalization)                                                                                 
                                                                                                  
 activation_1 (Activation)   (None, 146, 146, 32)         0         ['batch_normalization_1[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_1 (Separa  (None, 146, 146, 32)         1312      ['activation_1[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 max_pooling2d_20 (MaxPooli  (None, 73, 73, 32)           0         ['separable_conv2d_1[0][0]']  
 ng2D)                                                                                            
                                                                                                  
 batch_normalization_2 (Bat  (None, 73, 73, 32)           128       ['max_pooling2d_20[0][0]']    
 chNormalization)                                                                                 
                                                                                                  
 activation_2 (Activation)   (None, 73, 73, 32)           0         ['batch_normalization_2[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_2 (Separa  (None, 73, 73, 64)           2336      ['activation_2[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 batch_normalization_3 (Bat  (None, 73, 73, 64)           256       ['separable_conv2d_2[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 activation_3 (Activation)   (None, 73, 73, 64)           0         ['batch_normalization_3[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_3 (Separa  (None, 73, 73, 64)           4672      ['activation_3[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 max_pooling2d_21 (MaxPooli  (None, 37, 37, 64)           0         ['separable_conv2d_3[0][0]']  
 ng2D)                                                                                            
                                                                                                  
 batch_normalization_4 (Bat  (None, 37, 37, 64)           256       ['max_pooling2d_21[0][0]']    
 chNormalization)                                                                                 
                                                                                                  
 activation_4 (Activation)   (None, 37, 37, 64)           0         ['batch_normalization_4[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_4 (Separa  (None, 37, 37, 128)          8768      ['activation_4[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 batch_normalization_5 (Bat  (None, 37, 37, 128)          512       ['separable_conv2d_4[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 activation_5 (Activation)   (None, 37, 37, 128)          0         ['batch_normalization_5[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_5 (Separa  (None, 37, 37, 128)          17536     ['activation_5[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 max_pooling2d_22 (MaxPooli  (None, 19, 19, 128)          0         ['separable_conv2d_5[0][0]']  
 ng2D)                                                                                            
                                                                                                  
 batch_normalization_6 (Bat  (None, 19, 19, 128)          512       ['max_pooling2d_22[0][0]']    
 chNormalization)                                                                                 
                                                                                                  
 activation_6 (Activation)   (None, 19, 19, 128)          0         ['batch_normalization_6[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_6 (Separa  (None, 19, 19, 256)          33920     ['activation_6[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 batch_normalization_7 (Bat  (None, 19, 19, 256)          1024      ['separable_conv2d_6[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 activation_7 (Activation)   (None, 19, 19, 256)          0         ['batch_normalization_7[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_7 (Separa  (None, 19, 19, 256)          67840     ['activation_7[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 max_pooling2d_23 (MaxPooli  (None, 10, 10, 256)          0         ['separable_conv2d_7[0][0]']  
 ng2D)                                                                                            
                                                                                                  
 batch_normalization_8 (Bat  (None, 10, 10, 256)          1024      ['max_pooling2d_23[0][0]']    
 chNormalization)                                                                                 
                                                                                                  
 activation_8 (Activation)   (None, 10, 10, 256)          0         ['batch_normalization_8[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_8 (Separa  (None, 10, 10, 512)          133376    ['activation_8[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 batch_normalization_9 (Bat  (None, 10, 10, 512)          2048      ['separable_conv2d_8[0][0]']  
 chNormalization)                                                                                 
                                                                                                  
 activation_9 (Activation)   (None, 10, 10, 512)          0         ['batch_normalization_9[0][0]'
                                                                    ]                             
                                                                                                  
 separable_conv2d_9 (Separa  (None, 10, 10, 512)          266752    ['activation_9[0][0]']        
 bleConv2D)                                                                                       
                                                                                                  
 max_pooling2d_24 (MaxPooli  (None, 5, 5, 512)            0         ['separable_conv2d_9[0][0]']  
 ng2D)                                                                                            
                                                                                                  
 conv2d_25 (Conv2D)          (None, 5, 5, 512)            131072    ['max_pooling2d_23[0][0]']    
                                                                                                  
 add (Add)                   (None, 5, 5, 512)            0         ['max_pooling2d_24[0][0]',    
                                                                     'conv2d_25[0][0]']           
                                                                                                  
 global_average_pooling2d (  (None, 512)                  0         ['add[0][0]']                 
 GlobalAveragePooling2D)                                                                          
                                                                                                  
 dropout (Dropout)           (None, 512)                  0         ['global_average_pooling2d[0][
                                                                    0]']                          
                                                                                                  
 dense_10 (Dense)            (None, 1)                    513       ['dropout[0][0]']             
                                                                                                  
==================================================================================================
Total params: 677825 (2.59 MB)
Trainable params: 674817 (2.57 MB)
Non-trainable params: 3008 (11.75 KB)
__________________________________________________________________________________________________
In [ ]:
from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])

# Model.fit accepts generators directly; fit_generator is deprecated.
history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=60,
      validation_data=validation_generator,
      validation_steps=50)
Epoch 1/60
100/100 [==============================] - 14s 71ms/step - loss: 0.6731 - acc: 0.5940 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 2/60
100/100 [==============================] - 7s 69ms/step - loss: 0.6270 - acc: 0.6475 - val_loss: 0.7004 - val_acc: 0.5000
Epoch 3/60
100/100 [==============================] - 7s 70ms/step - loss: 0.5836 - acc: 0.7090 - val_loss: 0.7634 - val_acc: 0.5000
Epoch 4/60
100/100 [==============================] - 7s 67ms/step - loss: 0.5281 - acc: 0.7475 - val_loss: 0.8420 - val_acc: 0.5000
Epoch 5/60
100/100 [==============================] - 7s 70ms/step - loss: 0.4670 - acc: 0.7850 - val_loss: 1.0024 - val_acc: 0.5000
Epoch 6/60
100/100 [==============================] - 7s 68ms/step - loss: 0.4106 - acc: 0.8165 - val_loss: 1.0160 - val_acc: 0.5000
Epoch 7/60
100/100 [==============================] - 7s 68ms/step - loss: 0.3738 - acc: 0.8300 - val_loss: 1.1804 - val_acc: 0.5070
Epoch 8/60
100/100 [==============================] - 7s 69ms/step - loss: 0.3147 - acc: 0.8655 - val_loss: 0.7504 - val_acc: 0.7180
Epoch 9/60
100/100 [==============================] - 7s 69ms/step - loss: 0.2803 - acc: 0.8870 - val_loss: 0.6771 - val_acc: 0.7350
Epoch 10/60
100/100 [==============================] - 7s 69ms/step - loss: 0.2086 - acc: 0.9170 - val_loss: 0.7342 - val_acc: 0.7800
Epoch 11/60
100/100 [==============================] - 7s 69ms/step - loss: 0.1840 - acc: 0.9290 - val_loss: 1.2085 - val_acc: 0.7200
Epoch 12/60
100/100 [==============================] - 7s 69ms/step - loss: 0.1655 - acc: 0.9355 - val_loss: 0.7841 - val_acc: 0.7960
Epoch 13/60
100/100 [==============================] - 7s 69ms/step - loss: 0.1405 - acc: 0.9470 - val_loss: 0.7657 - val_acc: 0.7470
Epoch 14/60
100/100 [==============================] - 7s 70ms/step - loss: 0.1652 - acc: 0.9375 - val_loss: 3.4570 - val_acc: 0.6170
Epoch 15/60
100/100 [==============================] - 7s 71ms/step - loss: 0.1321 - acc: 0.9465 - val_loss: 0.7418 - val_acc: 0.7730
Epoch 16/60
100/100 [==============================] - 7s 73ms/step - loss: 0.1437 - acc: 0.9440 - val_loss: 9.1168 - val_acc: 0.5080
Epoch 17/60
100/100 [==============================] - 7s 69ms/step - loss: 0.1212 - acc: 0.9575 - val_loss: 1.4746 - val_acc: 0.7410
Epoch 18/60
100/100 [==============================] - 7s 69ms/step - loss: 0.1051 - acc: 0.9615 - val_loss: 0.7427 - val_acc: 0.8040
Epoch 19/60
100/100 [==============================] - 7s 70ms/step - loss: 0.1058 - acc: 0.9635 - val_loss: 1.1236 - val_acc: 0.8090
Epoch 20/60
100/100 [==============================] - 7s 70ms/step - loss: 0.1186 - acc: 0.9575 - val_loss: 1.0408 - val_acc: 0.7590
Epoch 21/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0955 - acc: 0.9640 - val_loss: 0.8679 - val_acc: 0.7670
Epoch 22/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0878 - acc: 0.9665 - val_loss: 2.6617 - val_acc: 0.6350
Epoch 23/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0848 - acc: 0.9650 - val_loss: 2.1299 - val_acc: 0.7120
Epoch 24/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0929 - acc: 0.9655 - val_loss: 1.3303 - val_acc: 0.7610
Epoch 25/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0905 - acc: 0.9675 - val_loss: 0.8851 - val_acc: 0.7720
Epoch 26/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0725 - acc: 0.9760 - val_loss: 0.8958 - val_acc: 0.8190
Epoch 27/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0716 - acc: 0.9765 - val_loss: 0.8368 - val_acc: 0.8130
Epoch 28/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0719 - acc: 0.9755 - val_loss: 1.3287 - val_acc: 0.7780
Epoch 29/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0665 - acc: 0.9780 - val_loss: 1.8774 - val_acc: 0.7370
Epoch 30/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0799 - acc: 0.9750 - val_loss: 2.1954 - val_acc: 0.6940
Epoch 31/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0765 - acc: 0.9710 - val_loss: 0.9405 - val_acc: 0.7900
Epoch 32/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0489 - acc: 0.9820 - val_loss: 0.8888 - val_acc: 0.8150
Epoch 33/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0797 - acc: 0.9730 - val_loss: 0.7904 - val_acc: 0.8080
Epoch 34/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0504 - acc: 0.9835 - val_loss: 1.2468 - val_acc: 0.7490
Epoch 35/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0323 - acc: 0.9875 - val_loss: 1.4123 - val_acc: 0.7970
Epoch 36/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0663 - acc: 0.9805 - val_loss: 0.8353 - val_acc: 0.7790
Epoch 37/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0627 - acc: 0.9765 - val_loss: 4.2015 - val_acc: 0.6520
Epoch 38/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0513 - acc: 0.9825 - val_loss: 0.9249 - val_acc: 0.8150
Epoch 39/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0485 - acc: 0.9815 - val_loss: 1.1401 - val_acc: 0.7890
Epoch 40/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0358 - acc: 0.9855 - val_loss: 1.0990 - val_acc: 0.8030
Epoch 41/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0528 - acc: 0.9845 - val_loss: 1.7769 - val_acc: 0.7300
Epoch 42/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0412 - acc: 0.9865 - val_loss: 1.2138 - val_acc: 0.8080
Epoch 43/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0457 - acc: 0.9870 - val_loss: 1.1751 - val_acc: 0.8020
Epoch 44/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0548 - acc: 0.9795 - val_loss: 2.1940 - val_acc: 0.7250
Epoch 45/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0518 - acc: 0.9830 - val_loss: 1.0726 - val_acc: 0.8040
Epoch 46/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0497 - acc: 0.9825 - val_loss: 1.3581 - val_acc: 0.7750
Epoch 47/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0267 - acc: 0.9910 - val_loss: 1.0290 - val_acc: 0.8170
Epoch 48/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0598 - acc: 0.9815 - val_loss: 1.4211 - val_acc: 0.7370
Epoch 49/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0297 - acc: 0.9910 - val_loss: 3.1233 - val_acc: 0.7220
Epoch 50/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0335 - acc: 0.9910 - val_loss: 1.1390 - val_acc: 0.8270
Epoch 51/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0453 - acc: 0.9870 - val_loss: 0.7916 - val_acc: 0.8640
Epoch 52/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0481 - acc: 0.9855 - val_loss: 1.9504 - val_acc: 0.7880
Epoch 53/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0481 - acc: 0.9825 - val_loss: 0.8907 - val_acc: 0.8240
Epoch 54/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0479 - acc: 0.9835 - val_loss: 0.8707 - val_acc: 0.8370
Epoch 55/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0343 - acc: 0.9870 - val_loss: 2.0437 - val_acc: 0.7370
Epoch 56/60
100/100 [==============================] - 7s 70ms/step - loss: 0.0511 - acc: 0.9835 - val_loss: 1.4708 - val_acc: 0.7970
Epoch 57/60
100/100 [==============================] - 7s 69ms/step - loss: 0.0395 - acc: 0.9880 - val_loss: 1.3706 - val_acc: 0.7530
Epoch 58/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0361 - acc: 0.9885 - val_loss: 2.1106 - val_acc: 0.7450
Epoch 59/60
100/100 [==============================] - 7s 71ms/step - loss: 0.0187 - acc: 0.9940 - val_loss: 3.2414 - val_acc: 0.7070
Epoch 60/60
100/100 [==============================] - 7s 73ms/step - loss: 0.0428 - acc: 0.9825 - val_loss: 1.3865 - val_acc: 0.7810
In [ ]:
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
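
The validation curve in the log above is noisy, so it can help to read off the best epoch numerically as well as from the plot. The following is a minimal sketch; `best_epoch` is a hypothetical helper (not part of Keras), and the `example` dictionary simply copies the `val_acc` values of epochs 49 to 52 from the log as a stand-in for `history.history`.

```python
def best_epoch(history_dict, metric="val_acc"):
    """Return (1-based epoch, value) where `metric` is highest."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Stand-in for history.history: val_acc of epochs 49-52 copied from the log.
example = {"val_acc": [0.7220, 0.8270, 0.8640, 0.7880]}
print(best_epoch(example))
```

With a real training run, `best_epoch(history.history)` would report the peak over all epochs instead of this four-epoch excerpt.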
In [ ]:
model.save('cats_and_dogs_small_Assign6_3_SepConv2D.h5')
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3000: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
  saving_api.save_model(

With SeparableConv2D layers in place of Conv2D, validation accuracy peaked at 86.4% (epoch 51), up from 76.7% for the original Conv2D-only model without regularization, a gain of roughly 10 percentage points. The model still overfits: training accuracy reaches about 98%, while validation accuracy fluctuates widely and ends at 78.1%.¶

The number of trainable parameters dropped to 674,817 from the original 3,453,121.¶

That is about a fifth of the original parameter count, so the roughly 10-point accuracy lift comes from a much smaller model.
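
The parameter savings can be checked by hand. The sketch below assumes the standard depthwise separable factorization with a depth multiplier of 1 (one k x k filter per input channel, followed by a 1x1 pointwise convolution); the helper names `conv2d_params` and `separable_conv2d_params` are made up for this illustration.

```python
def conv2d_params(in_ch, out_ch, kernel, use_bias=True):
    """Weights of a regular k x k convolution."""
    return kernel * kernel * in_ch * out_ch + (out_ch if use_bias else 0)

def separable_conv2d_params(in_ch, out_ch, kernel, use_bias=True):
    """Weights of a depthwise separable convolution (depth multiplier 1)."""
    depthwise = kernel * kernel * in_ch   # one k x k filter per input channel
    pointwise = in_ch * out_ch            # 1x1 convolution mixing channels
    return depthwise + pointwise + (out_ch if use_bias else 0)

# 256 -> 512 channels, 3x3 kernel, no bias (as in separable_conv2d_8 above)
print(separable_conv2d_params(256, 512, 3, use_bias=False))  # 133376
print(conv2d_params(256, 512, 3, use_bias=False))            # 1179648
```

The first number matches the `Param #` column for `separable_conv2d_8` in the summary; a regular 3x3 convolution with the same channel counts would need nearly nine times as many weights.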

Using data augmentation¶

Overfitting is caused by having too few samples to learn from, rendering us unable to train a model able to generalize to new data. Given infinite data, our model would be exposed to every possible aspect of the data distribution at hand: we would never overfit. Data augmentation takes the approach of generating more training data from existing training samples, by "augmenting" the samples via a number of random transformations that yield believable-looking images. The goal is that at training time, our model would never see the exact same picture twice. This helps the model get exposed to more aspects of the data and generalize better.

In Keras, this can be done by configuring a number of random transformations to be performed on the images read by our ImageDataGenerator instance. Let's get started with an example:

In [ ]:
datagen = ImageDataGenerator(
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

These are just a few of the options available (for more, see the Keras documentation). Let's quickly go over what we just wrote:

  • rotation_range is a value in degrees (0-180), a range within which to randomly rotate pictures.
  • width_shift_range and height_shift_range are ranges (as a fraction of total width or height) within which to randomly translate pictures horizontally or vertically.
  • shear_range is for randomly applying shearing transformations.
  • zoom_range is for randomly zooming inside pictures.
  • horizontal_flip is for randomly flipping half of the images horizontally -- relevant when there are no assumptions of horizontal asymmetry (e.g. real-world pictures).
  • fill_mode is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift.
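
To make two of these options concrete, here is a NumPy-only sketch of a random horizontal flip followed by a random horizontal shift whose exposed pixels are filled from the nearest edge column. It illustrates the idea only and is not how `ImageDataGenerator` is implemented; `random_flip_and_shift` and `max_shift_frac` are hypothetical names, and the small integer array stands in for a real picture.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift_frac=0.2):
    """Randomly flip an (H, W, C) image left-right, then shift it horizontally."""
    width = image.shape[1]
    # Flip about half of the images, as horizontal_flip=True would.
    if rng.random() < 0.5:
        image = image[:, ::-1]
    # Shift by up to max_shift_frac of the width, in either direction.
    max_shift = int(width * max_shift_frac)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    # Output column j reads input column j - shift; clipping repeats the
    # nearest edge column, similar in spirit to fill_mode='nearest'.
    source_cols = np.clip(np.arange(width) - shift, 0, width - 1)
    return image[:, source_cols]

rng = np.random.default_rng(0)
image = np.arange(5 * 10 * 3).reshape(5, 10, 3)  # stand-in for a real picture
augmented = [random_flip_and_shift(image, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

Every output keeps the original shape and contains only pixel values already present in the input, which is the sense in which augmentation remixes existing information rather than creating new information.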

Let's take a look at our augmented images:

If we train a new network using this data augmentation configuration, our network will never see the same input twice. However, the inputs that it sees are still heavily intercorrelated, since they come from a small number of original images -- we cannot produce new information, we can only remix existing information. As such, this might not be quite enough to completely get rid of overfitting. To further fight overfitting, we will also add a Dropout layer to our model, right before the densely-connected classifier:
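
As a reminder of what a dropout layer does during training, here is a NumPy sketch of inverted dropout: a random fraction `rate` of the features is zeroed and the survivors are scaled by 1 / (1 - rate) so the expected activation stays the same. The function `dropout` below is an illustrative stand-in, not the Keras layer itself.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of x, rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time the input passes through unchanged
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
features = np.ones((2, 8))
print(dropout(features, 0.5, rng))
```

With `rate=0.5`, each entry of the all-ones input comes out as either 0 or 2.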

In [ ]:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])

Let's train our network using data augmentation and dropout:
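
The cell below pairs `batch_size=32` with `steps_per_epoch=100`, which draws 3,200 augmented images per epoch from 2,000 originals. If one pass over the data per epoch is preferred, the step counts can be derived from the dataset size and batch size. This is a small sketch; `steps_for` is a hypothetical helper, and the sample counts are the 2,000 training and 1,000 validation images used in this notebook.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to cover num_samples once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_for(2000, 32)  # 63
val_steps = steps_for(1000, 32)    # 32
print(train_steps, val_steps)
```

These values could then be passed as `steps_per_epoch` and `validation_steps`.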

In [ ]:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/100
100/100 [==============================] - 24s - loss: 0.6857 - acc: 0.5447 - val_loss: 0.6620 - val_acc: 0.5888
Epoch 2/100
100/100 [==============================] - 23s - loss: 0.6710 - acc: 0.5675 - val_loss: 0.6606 - val_acc: 0.5825
Epoch 3/100
100/100 [==============================] - 22s - loss: 0.6609 - acc: 0.5913 - val_loss: 0.6663 - val_acc: 0.5711
Epoch 4/100
100/100 [==============================] - 22s - loss: 0.6446 - acc: 0.6178 - val_loss: 0.6200 - val_acc: 0.6379
Epoch 5/100
100/100 [==============================] - 22s - loss: 0.6267 - acc: 0.6325 - val_loss: 0.6280 - val_acc: 0.5996
Epoch 6/100
100/100 [==============================] - 22s - loss: 0.6080 - acc: 0.6631 - val_loss: 0.6841 - val_acc: 0.5490
Epoch 7/100
100/100 [==============================] - 22s - loss: 0.5992 - acc: 0.6700 - val_loss: 0.5717 - val_acc: 0.6946
Epoch 8/100
100/100 [==============================] - 22s - loss: 0.5908 - acc: 0.6819 - val_loss: 0.5858 - val_acc: 0.6764
Epoch 9/100
100/100 [==============================] - 22s - loss: 0.5869 - acc: 0.6856 - val_loss: 0.5658 - val_acc: 0.6785
Epoch 10/100
100/100 [==============================] - 23s - loss: 0.5692 - acc: 0.6934 - val_loss: 0.5409 - val_acc: 0.7170
Epoch 11/100
100/100 [==============================] - 22s - loss: 0.5708 - acc: 0.6897 - val_loss: 0.5325 - val_acc: 0.7274
Epoch 12/100
100/100 [==============================] - 23s - loss: 0.5583 - acc: 0.7047 - val_loss: 0.5683 - val_acc: 0.7126
Epoch 13/100
100/100 [==============================] - 22s - loss: 0.5602 - acc: 0.7069 - val_loss: 0.6010 - val_acc: 0.6593
Epoch 14/100
100/100 [==============================] - 22s - loss: 0.5510 - acc: 0.7231 - val_loss: 0.5387 - val_acc: 0.7229
Epoch 15/100
100/100 [==============================] - 23s - loss: 0.5527 - acc: 0.7175 - val_loss: 0.5204 - val_acc: 0.7322
Epoch 16/100
100/100 [==============================] - 23s - loss: 0.5426 - acc: 0.7181 - val_loss: 0.5083 - val_acc: 0.7410
Epoch 17/100
100/100 [==============================] - 23s - loss: 0.5399 - acc: 0.7344 - val_loss: 0.5103 - val_acc: 0.7468
Epoch 18/100
100/100 [==============================] - 23s - loss: 0.5375 - acc: 0.7312 - val_loss: 0.5133 - val_acc: 0.7430
Epoch 19/100
100/100 [==============================] - 22s - loss: 0.5308 - acc: 0.7338 - val_loss: 0.4936 - val_acc: 0.7610
Epoch 20/100
100/100 [==============================] - 22s - loss: 0.5225 - acc: 0.7387 - val_loss: 0.4952 - val_acc: 0.7563
Epoch 21/100
100/100 [==============================] - 22s - loss: 0.5180 - acc: 0.7491 - val_loss: 0.4999 - val_acc: 0.7481
Epoch 22/100
100/100 [==============================] - 23s - loss: 0.5118 - acc: 0.7538 - val_loss: 0.4770 - val_acc: 0.7764
Epoch 23/100
100/100 [==============================] - 22s - loss: 0.5245 - acc: 0.7378 - val_loss: 0.4929 - val_acc: 0.7671
Epoch 24/100
100/100 [==============================] - 22s - loss: 0.5136 - acc: 0.7503 - val_loss: 0.4709 - val_acc: 0.7732
Epoch 25/100
100/100 [==============================] - 22s - loss: 0.4980 - acc: 0.7512 - val_loss: 0.4775 - val_acc: 0.7684
Epoch 26/100
100/100 [==============================] - 22s - loss: 0.4875 - acc: 0.7622 - val_loss: 0.4745 - val_acc: 0.7790
Epoch 27/100
100/100 [==============================] - 22s - loss: 0.5044 - acc: 0.7578 - val_loss: 0.5000 - val_acc: 0.7403
Epoch 28/100
100/100 [==============================] - 22s - loss: 0.4948 - acc: 0.7603 - val_loss: 0.4619 - val_acc: 0.7754
Epoch 29/100
100/100 [==============================] - 22s - loss: 0.4898 - acc: 0.7578 - val_loss: 0.4730 - val_acc: 0.7726
Epoch 30/100
100/100 [==============================] - 22s - loss: 0.4808 - acc: 0.7691 - val_loss: 0.4599 - val_acc: 0.7716
Epoch 31/100
100/100 [==============================] - 22s - loss: 0.4792 - acc: 0.7678 - val_loss: 0.4671 - val_acc: 0.7790
Epoch 32/100
100/100 [==============================] - 22s - loss: 0.4723 - acc: 0.7716 - val_loss: 0.4451 - val_acc: 0.7849
Epoch 33/100
100/100 [==============================] - 22s - loss: 0.4750 - acc: 0.7694 - val_loss: 0.4827 - val_acc: 0.7665
Epoch 34/100
100/100 [==============================] - 22s - loss: 0.4816 - acc: 0.7647 - val_loss: 0.4953 - val_acc: 0.7513
Epoch 35/100
100/100 [==============================] - 22s - loss: 0.4598 - acc: 0.7813 - val_loss: 0.4426 - val_acc: 0.7843
Epoch 36/100
100/100 [==============================] - 23s - loss: 0.4643 - acc: 0.7781 - val_loss: 0.4692 - val_acc: 0.7680
Epoch 37/100
100/100 [==============================] - 22s - loss: 0.4675 - acc: 0.7778 - val_loss: 0.4849 - val_acc: 0.7633
Epoch 38/100
100/100 [==============================] - 22s - loss: 0.4658 - acc: 0.7737 - val_loss: 0.4632 - val_acc: 0.7760
Epoch 39/100
100/100 [==============================] - 22s - loss: 0.4581 - acc: 0.7866 - val_loss: 0.4489 - val_acc: 0.7880
Epoch 40/100
100/100 [==============================] - 23s - loss: 0.4485 - acc: 0.7856 - val_loss: 0.4479 - val_acc: 0.7931
Epoch 41/100
100/100 [==============================] - 22s - loss: 0.4637 - acc: 0.7759 - val_loss: 0.4453 - val_acc: 0.7990
Epoch 42/100
100/100 [==============================] - 22s - loss: 0.4528 - acc: 0.7841 - val_loss: 0.4758 - val_acc: 0.7868
Epoch 43/100
100/100 [==============================] - 22s - loss: 0.4481 - acc: 0.7856 - val_loss: 0.4472 - val_acc: 0.7893
Epoch 44/100
100/100 [==============================] - 22s - loss: 0.4540 - acc: 0.7953 - val_loss: 0.4366 - val_acc: 0.7867
Epoch 45/100
100/100 [==============================] - 22s - loss: 0.4411 - acc: 0.7919 - val_loss: 0.4708 - val_acc: 0.7697
Epoch 46/100
100/100 [==============================] - 22s - loss: 0.4493 - acc: 0.7869 - val_loss: 0.4366 - val_acc: 0.7829
Epoch 47/100
100/100 [==============================] - 22s - loss: 0.4436 - acc: 0.7916 - val_loss: 0.4307 - val_acc: 0.8090
Epoch 48/100
100/100 [==============================] - 22s - loss: 0.4391 - acc: 0.7928 - val_loss: 0.4203 - val_acc: 0.8065
Epoch 49/100
100/100 [==============================] - 23s - loss: 0.4284 - acc: 0.8053 - val_loss: 0.4422 - val_acc: 0.8041
Epoch 50/100
100/100 [==============================] - 22s - loss: 0.4492 - acc: 0.7906 - val_loss: 0.5422 - val_acc: 0.7437
Epoch 51/100
100/100 [==============================] - 22s - loss: 0.4292 - acc: 0.7953 - val_loss: 0.4446 - val_acc: 0.7932
Epoch 52/100
100/100 [==============================] - 22s - loss: 0.4275 - acc: 0.8037 - val_loss: 0.4287 - val_acc: 0.7989
Epoch 53/100
100/100 [==============================] - 22s - loss: 0.4297 - acc: 0.7975 - val_loss: 0.4091 - val_acc: 0.8046
Epoch 54/100
100/100 [==============================] - 23s - loss: 0.4198 - acc: 0.7978 - val_loss: 0.4413 - val_acc: 0.7964
Epoch 55/100
100/100 [==============================] - 23s - loss: 0.4195 - acc: 0.8019 - val_loss: 0.4265 - val_acc: 0.8001
Epoch 56/100
100/100 [==============================] - 22s - loss: 0.4081 - acc: 0.8056 - val_loss: 0.4374 - val_acc: 0.7957
Epoch 57/100
100/100 [==============================] - 22s - loss: 0.4214 - acc: 0.8006 - val_loss: 0.4228 - val_acc: 0.8020
Epoch 58/100
100/100 [==============================] - 22s - loss: 0.4050 - acc: 0.8097 - val_loss: 0.4332 - val_acc: 0.7900
Epoch 59/100
100/100 [==============================] - 22s - loss: 0.4162 - acc: 0.8134 - val_loss: 0.4088 - val_acc: 0.8099
Epoch 60/100
100/100 [==============================] - 22s - loss: 0.4042 - acc: 0.8141 - val_loss: 0.4436 - val_acc: 0.7957
Epoch 61/100
100/100 [==============================] - 23s - loss: 0.4016 - acc: 0.8212 - val_loss: 0.4082 - val_acc: 0.8189
Epoch 62/100
100/100 [==============================] - 22s - loss: 0.4167 - acc: 0.8097 - val_loss: 0.3935 - val_acc: 0.8236
Epoch 63/100
100/100 [==============================] - 23s - loss: 0.4052 - acc: 0.8138 - val_loss: 0.4509 - val_acc: 0.7824
Epoch 64/100
100/100 [==============================] - 22s - loss: 0.4011 - acc: 0.8209 - val_loss: 0.3874 - val_acc: 0.8299
Epoch 65/100
100/100 [==============================] - 22s - loss: 0.3966 - acc: 0.8131 - val_loss: 0.4328 - val_acc: 0.7970
Epoch 66/100
100/100 [==============================] - 23s - loss: 0.3889 - acc: 0.8163 - val_loss: 0.4766 - val_acc: 0.7719
Epoch 67/100
100/100 [==============================] - 22s - loss: 0.3960 - acc: 0.8163 - val_loss: 0.3859 - val_acc: 0.8325
Epoch 68/100
100/100 [==============================] - 22s - loss: 0.3893 - acc: 0.8231 - val_loss: 0.4172 - val_acc: 0.8128
Epoch 69/100
100/100 [==============================] - 23s - loss: 0.3828 - acc: 0.8219 - val_loss: 0.4023 - val_acc: 0.8215
Epoch 70/100
100/100 [==============================] - 22s - loss: 0.3909 - acc: 0.8275 - val_loss: 0.4275 - val_acc: 0.8008
Epoch 71/100
100/100 [==============================] - 22s - loss: 0.3826 - acc: 0.8244 - val_loss: 0.3815 - val_acc: 0.8177
Epoch 72/100
100/100 [==============================] - 22s - loss: 0.3837 - acc: 0.8272 - val_loss: 0.4040 - val_acc: 0.8287
Epoch 73/100
100/100 [==============================] - 23s - loss: 0.3812 - acc: 0.8222 - val_loss: 0.4039 - val_acc: 0.8058
Epoch 74/100
100/100 [==============================] - 22s - loss: 0.3829 - acc: 0.8281 - val_loss: 0.4204 - val_acc: 0.8015
Epoch 75/100
100/100 [==============================] - 22s - loss: 0.3708 - acc: 0.8350 - val_loss: 0.4083 - val_acc: 0.8204
Epoch 76/100
100/100 [==============================] - 22s - loss: 0.3831 - acc: 0.8216 - val_loss: 0.3899 - val_acc: 0.8215
Epoch 77/100
100/100 [==============================] - 22s - loss: 0.3695 - acc: 0.8375 - val_loss: 0.3963 - val_acc: 0.8293
Epoch 78/100
100/100 [==============================] - 22s - loss: 0.3809 - acc: 0.8234 - val_loss: 0.4046 - val_acc: 0.8236
Epoch 79/100
100/100 [==============================] - 22s - loss: 0.3637 - acc: 0.8362 - val_loss: 0.3990 - val_acc: 0.8325
Epoch 80/100
100/100 [==============================] - 22s - loss: 0.3596 - acc: 0.8400 - val_loss: 0.3925 - val_acc: 0.8350
Epoch 81/100
100/100 [==============================] - 22s - loss: 0.3762 - acc: 0.8303 - val_loss: 0.3813 - val_acc: 0.8331
Epoch 82/100
100/100 [==============================] - 23s - loss: 0.3672 - acc: 0.8347 - val_loss: 0.4539 - val_acc: 0.7931
Epoch 83/100
100/100 [==============================] - 22s - loss: 0.3636 - acc: 0.8353 - val_loss: 0.3988 - val_acc: 0.8261
Epoch 84/100
100/100 [==============================] - 22s - loss: 0.3503 - acc: 0.8453 - val_loss: 0.3987 - val_acc: 0.8325
Epoch 85/100
100/100 [==============================] - 22s - loss: 0.3586 - acc: 0.8437 - val_loss: 0.3842 - val_acc: 0.8306
Epoch 86/100
100/100 [==============================] - 22s - loss: 0.3624 - acc: 0.8353 - val_loss: 0.4100 - val_acc: 0.8196
Epoch 87/100
100/100 [==============================] - 22s - loss: 0.3596 - acc: 0.8422 - val_loss: 0.3814 - val_acc: 0.8331
Epoch 88/100
100/100 [==============================] - 22s - loss: 0.3487 - acc: 0.8494 - val_loss: 0.4266 - val_acc: 0.8109
Epoch 89/100
100/100 [==============================] - 22s - loss: 0.3598 - acc: 0.8400 - val_loss: 0.4076 - val_acc: 0.8325
Epoch 90/100
100/100 [==============================] - 22s - loss: 0.3510 - acc: 0.8450 - val_loss: 0.3762 - val_acc: 0.8388
Epoch 91/100
100/100 [==============================] - 22s - loss: 0.3458 - acc: 0.8450 - val_loss: 0.4684 - val_acc: 0.8015
Epoch 92/100
100/100 [==============================] - 22s - loss: 0.3454 - acc: 0.8441 - val_loss: 0.4017 - val_acc: 0.8204
Epoch 93/100
100/100 [==============================] - 22s - loss: 0.3402 - acc: 0.8487 - val_loss: 0.3928 - val_acc: 0.8204
Epoch 94/100
100/100 [==============================] - 22s - loss: 0.3569 - acc: 0.8394 - val_loss: 0.4005 - val_acc: 0.8338
Epoch 95/100
100/100 [==============================] - 22s - loss: 0.3425 - acc: 0.8494 - val_loss: 0.3641 - val_acc: 0.8439
Epoch 96/100
100/100 [==============================] - 22s - loss: 0.3335 - acc: 0.8531 - val_loss: 0.3811 - val_acc: 0.8363
Epoch 97/100
100/100 [==============================] - 22s - loss: 0.3204 - acc: 0.8581 - val_loss: 0.3786 - val_acc: 0.8331
Epoch 98/100
100/100 [==============================] - 22s - loss: 0.3250 - acc: 0.8606 - val_loss: 0.4205 - val_acc: 0.8236
Epoch 99/100
100/100 [==============================] - 22s - loss: 0.3255 - acc: 0.8581 - val_loss: 0.3518 - val_acc: 0.8460
Epoch 100/100
100/100 [==============================] - 22s - loss: 0.3280 - acc: 0.8491 - val_loss: 0.3776 - val_acc: 0.8439
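
As a rough sanity check on the generator settings used above, the following sketch works out how much data each epoch draws. The image counts and batch size are taken from the cell above; the variable names are illustrative, and the sample counts are upper bounds, since the last batch of each pass over a directory may be smaller than `batch_size`.

```python
import math

n_train, n_val = 2000, 1000              # from the "Found ... images" lines above
batch_size = 32
steps_per_epoch, validation_steps = 100, 50

# Batches needed to go through each directory exactly once
train_batches_per_pass = math.ceil(n_train / batch_size)
val_batches_per_pass = math.ceil(n_val / batch_size)

# Upper bound on the number of images drawn per epoch
train_samples_per_epoch = steps_per_epoch * batch_size
val_samples_per_epoch = validation_steps * batch_size

print("train batches per pass:", train_batches_per_pass)
print("validation batches per pass:", val_batches_per_pass)
print("train images drawn per epoch (at most):", train_samples_per_epoch)
print("validation images drawn per epoch (at most):", val_samples_per_epoch)
```

With these settings each epoch requests more batches than a single pass over either directory provides, so the generators have to loop over the data. Setting `steps_per_epoch` and `validation_steps` to the per-pass batch counts would make one epoch correspond to one pass.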

Let's save our model -- we will be using it in the section on convnet visualization.

In [ ]:
model.save('cats_and_dogs_small_2.h5')

Let's plot our results again:

In [ ]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

Thanks to data augmentation and dropout, we are no longer overfitting: the training curves are rather closely tracking the validation curves. We are now able to reach an accuracy of 82%, a 15% relative improvement over the non-regularized model.
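
The figures quoted above can be checked with a little arithmetic. The sketch below recomputes the relative improvement from the 71% baseline mentioned in the introduction and the 82% quoted here, then measures the average gap between training and validation accuracy over the last five epochs of the training log. The variable names are illustrative.

```python
baseline_acc = 0.71       # non-regularized model
regularized_acc = 0.82    # with data augmentation and dropout

relative_gain = (regularized_acc - baseline_acc) / baseline_acc
print("relative improvement: %.1f%%" % (100 * relative_gain))

# Epochs 96-100, copied from the training log
acc = [0.8531, 0.8581, 0.8606, 0.8581, 0.8491]
val_acc = [0.8363, 0.8331, 0.8236, 0.8460, 0.8439]

# Average difference between training and validation accuracy
mean_gap = sum(a - v for a, v in zip(acc, val_acc)) / len(acc)
print("mean train/validation gap: %.4f" % mean_gap)
```

A gap of roughly two percentage points is consistent with the curves tracking each other closely.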

By leveraging regularization techniques even further and by tuning the network's parameters (such as the number of filters per convolution layer, or the number of layers in the network), we may be able to get even better accuracy, likely up to 86-87%. However, it would prove very difficult to go any higher just by training our own convnet from scratch, simply because we have so little data to work with. As a next step to improve our accuracy on this problem, we will have to leverage a pre-trained model, which will be the focus of the next two sections.

Problem 3¶

In [ ]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image
import cv2
from google.colab import drive

drive.mount("/content/gdrive")
Mounted at /content/gdrive

rotation_range

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
# datagen = ImageDataGenerator(zoom_range=0.7)
datagen = ImageDataGenerator(rotation_range=90)
# datagen = ImageDataGenerator(horizontal_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)

width_shift_range

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
# datagen = ImageDataGenerator(zoom_range=0.7)
datagen = ImageDataGenerator(width_shift_range=300)
# datagen = ImageDataGenerator(horizontal_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)
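
For reference, `width_shift_range` is interpreted differently depending on its value: according to the Keras documentation, a float below 1 is a fraction of the image width, while a value of 1 or more is a number of pixels. The helper below, `max_shift_pixels`, is a hypothetical name used only to illustrate that rule; it is not part of Keras.

```python
def max_shift_pixels(shift_range, image_width):
    """Approximate bound, in pixels, on the horizontal shift (assumed rule)."""
    if shift_range < 1:
        return shift_range * image_width   # fraction of the total width
    return float(shift_range)              # already expressed in pixels


# Setting used for training earlier: 20% of a 150-pixel-wide image
print(max_shift_pixels(0.2, 150))

# Setting used in this cell: 300 pixels on the 300-pixel-wide lion image
print(max_shift_pixels(300, 300))
```

With a bound equal to the full image width, some draws can push nearly all of the original content out of the frame, which makes the effect easy to see but is far more aggressive than the 0.2 used for training.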

shear_range

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
# datagen = ImageDataGenerator(zoom_range=0.7)
datagen = ImageDataGenerator(shear_range=90)
# datagen = ImageDataGenerator(horizontal_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)

zoom_range

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
datagen = ImageDataGenerator(zoom_range=0.7)
#datagen = ImageDataGenerator(rotation_range=90)
# datagen = ImageDataGenerator(horizontal_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)
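
The Keras documentation states that a float `zoom_range` is expanded to the interval `[1 - zoom_range, 1 + zoom_range]`, from which zoom factors are drawn. The small helper below (`zoom_bounds`, a hypothetical name) just spells out that interval for the values used in this notebook.

```python
def zoom_bounds(zoom_range):
    """Return (lower, upper) for a float or a [lower, upper] pair."""
    if isinstance(zoom_range, (int, float)):
        return (1 - zoom_range, 1 + zoom_range)
    lower, upper = zoom_range
    return (lower, upper)


# Value used in this cell
lower, upper = zoom_bounds(0.7)
print("zoom_range=0.7 -> [%.1f, %.1f]" % (lower, upper))

# Value used for training earlier
train_lower, train_upper = zoom_bounds(0.2)
print("zoom_range=0.2 -> [%.1f, %.1f]" % (train_lower, train_upper))
```

The interval for 0.7 is much wider than the one used for training, so the generated images vary far more in scale.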

vertical_flip

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
# datagen = ImageDataGenerator(zoom_range=0.7)
#datagen = ImageDataGenerator(rotation_range=90)
datagen = ImageDataGenerator(vertical_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)

horizontal_flip

In [ ]:
### uncomment these to see different examples of ImageDataGenerator
# datagen = ImageDataGenerator(brightness_range=[10.0, 15.5])
# datagen = ImageDataGenerator(zoom_range=0.7)
#datagen = ImageDataGenerator(rotation_range=90)
datagen = ImageDataGenerator(horizontal_flip=True)

img_path = '/content/gdrive/MyDrive/ColabNotebooks/Lion.jpeg'

img = image.load_img(img_path)
x = image.img_to_array(img)
print(x.shape)

plt.imshow(image.array_to_img(x))
plt.show()
print("Original Image")


x = x.reshape((1,) + x.shape)
print(x.shape)
i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 2 == 0:
        break
plt.show()
(168, 300, 3)
Original Image
(1, 168, 300, 3)
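
To see what the two flip options do at the pixel level, here is a minimal sketch on a tiny 2x3 "image" stored as nested lists (rows, then columns). It only illustrates the geometry of the two flip cells above; `ImageDataGenerator` applies each enabled flip at random, so some generated images come back unchanged. The function names are illustrative.

```python
def horizontal_flip(img):
    """Mirror left-right: reverse the order of the columns in every row."""
    return [row[::-1] for row in img]


def vertical_flip(img):
    """Mirror top-bottom: reverse the order of the rows."""
    return img[::-1]


tiny = [[1, 2, 3],
        [4, 5, 6]]

print(horizontal_flip(tiny))
print(vertical_flip(tiny))
```

Applying either function twice returns the original image, which is why a flip adds exactly one extra variant per picture.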